A Fun and Engaging Interface for Crowdsourcing Named Entities

Authors

  • Kara Greenfield
  • Kelsey Chan
  • Joseph P. Campbell
Abstract

Many current problems in natural language processing are best solved by training algorithms on an annotated in-language, in-domain corpus. The more representative the training corpus is of the test data, the better the algorithm will perform, but also the less likely it is that such a corpus has already been annotated. Annotating corpora for natural language processing tasks is typically a time-consuming and expensive process. In this paper, we provide a case study in using crowdsourcing to curate an in-domain corpus for named entity recognition, a common problem in natural language processing. In particular, we present our use of fun, engaging user interfaces as a way to entice workers to take part in our crowdsourcing task without inflating our payments in a way that would attract more mercenary workers than conscientious ones. Additionally, we provide a survey of alternative interfaces for collecting annotations of named entities and compare our approach to those systems.

Similar resources

The GATE Crowdsourcing Plugin: Crowdsourcing Annotated Corpora Made Easy

Crowdsourcing is an increasingly popular, collaborative approach for acquiring annotated corpora. Despite this, reuse of corpus conversion tools and user interfaces between projects is still problematic, since these are not generally made available. This demonstration will introduce the new, open-source GATE Crowdsourcing plugin, which offers infrastructural support for mapping documents to cro...

PhotoPlay: A Collocated Collaborative Photo Tagging Game on a Horizontal Display

We are exploring the use of collaborative games to generate meaningful textual tags for photos. We have designed PhotoPlay to take advantage of the social engagement typical of board games and provide a collocated ludic environment conducive to the creation of text tags. We evaluated PhotoPlay and found that it was fun and socially engaging for players. The milieu of the game also facilitated p...

Towards hybrid NER: an extended study of content and crowdsourcing-related performance factors

This paper explores the factors that influence the human component in hybrid approaches to named entity recognition (NER) in microblogs, which combine state-of-the-art automatic techniques with human and crowd computing. We identify a set of content and crowdsourcing-related features (number of entities in a post, types of entities, content sentiment, skipped true-positive posts, average time sp...

An Extended Study of Content and Crowdsourcing-related Performance Factors in Named Entity Annotation

Hybrid annotation techniques have emerged as a promising approach to carry out named entity recognition on noisy microposts. In this paper, we identify a set of content and crowdsourcing-related features (number and type of entities in a post, average length and sentiment of tweets, composition of skipped tweets, average time spent to complete the tasks, and interaction with the user interface)...

Named Entity Oriented Difference Analysis of News Articles and Its Application

To support the efficient gathering of diverse information about a news event, we focus on descriptions of named entities (persons, organizations, locations) in news articles. We extend the stakeholder mining proposed by Ogawa et al. and extract descriptions of named entities in articles. We propose three measures (difference in opinion, difference in details, and difference in factor coverage) ...

Publication date: 2016